Mind-reading machines: automated inference of complex mental states
Abstract
People express their mental states all the time, even when interacting with machines. These mental states shape the decisions that we make, govern how we communicate with others, and affect our performance. The ability to attribute mental states to others from their behaviour, and to use that knowledge to guide one’s own actions and predict those of others is known as theory of mind or mind-reading. The principal contribution of this dissertation is the real time inference of a wide range of mental states from head and facial displays in a video stream. In particular, the focus is on the inference of complex mental states: the affective and cognitive states of mind that are not part of the set of basic emotions. The automated mental state inference system is inspired by and draws on the fundamental role of mind-reading in communication and decision-making. The dissertation describes the design, implementation and validation of a computational model of mind-reading. The design is based on the results of a number of experiments that I have undertaken to analyse the facial signals and dynamics of complex mental states. The resulting model is a multi-level probabilistic graphical model that represents the facial events in a raw video stream at different levels of spatial and temporal abstraction. Dynamic Bayesian Networks model observable head and facial displays, and corresponding hidden mental states over time. The automated mind-reading system implements the model by combining top-down predictions of mental state models with bottom-up vision-based processing of the face. To support intelligent human-computer interaction, the system meets three important criteria. These are: full automation so that no manual preprocessing or segmentation is required, real time execution, and the categorization of mental states early enough after their onset to ensure that the resulting knowledge is current and useful. 
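As a rough illustration of how a hidden mental state can be inferred over time from observed displays, the sketch below runs forward filtering on a single-level hidden Markov model, which is the simplest kind of Dynamic Bayesian Network. It is a deliberate simplification and not the multi-level model of the dissertation: the two states, the three display labels, and every probability are hypothetical values chosen only for the example.

```python
# Minimal sketch: forward filtering in a two-state hidden Markov model.
# All state names, display names and probabilities are hypothetical;
# they are NOT taken from the dissertation.

STATES = ["agreeing", "disagreeing"]

# P(state at t = 0)
PRIOR = {"agreeing": 0.5, "disagreeing": 0.5}

# P(state at t | state at t - 1): mental states tend to persist.
TRANSITION = {
    "agreeing": {"agreeing": 0.9, "disagreeing": 0.1},
    "disagreeing": {"agreeing": 0.1, "disagreeing": 0.9},
}

# P(observed display | state)
EMISSION = {
    "agreeing": {"head_nod": 0.6, "head_shake": 0.1, "neutral": 0.3},
    "disagreeing": {"head_nod": 0.1, "head_shake": 0.6, "neutral": 0.3},
}


def forward_filter(observations):
    """Return P(state_t | displays_1..t) for every time step t."""
    beliefs = []
    belief = dict(PRIOR)
    for t, display in enumerate(observations):
        if t > 0:
            # Prediction step: push the previous belief through the
            # transition model.
            belief = {
                s: sum(belief[p] * TRANSITION[p][s] for p in STATES)
                for s in STATES
            }
        # Update step: weight by the likelihood of the observed display,
        # then renormalise so the belief sums to one.
        unnormalised = {s: belief[s] * EMISSION[s][display] for s in STATES}
        total = sum(unnormalised.values())
        belief = {s: unnormalised[s] / total for s in STATES}
        beliefs.append(belief)
    return beliefs


if __name__ == "__main__":
    for step, b in enumerate(forward_filter(["head_nod", "head_nod", "neutral"])):
        print(step, {s: round(p, 3) for s, p in b.items()})
```

Because the belief is updated after every observation, an estimate is available at each frame rather than only at the end of a sequence, which is the property that early, real time categorization relies on.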
The system is evaluated in terms of recognition accuracy, generalization and real time performance for six broad classes of complex mental states (agreeing, concentrating, disagreeing, interested, thinking and unsure) on two different corpora. The system successfully classifies and generalizes to new examples of these classes with an accuracy and speed that are comparable to those of human recognition. The research I present here significantly advances the nascent ability of machines to infer cognitive-affective mental states in real time from nonverbal expressions of people. By developing a real time system for the inference of a wide range of mental states beyond the basic emotions, I have widened the scope of human-computer interaction scenarios in which this technology can be integrated. This is an important step towards building socially and emotionally intelligent machines.

Acknowledgements

This dissertation was inspired and shaped through discussions with many people and was only made possible by the support of many more who have shared their love, prayers, time, and know-how with me. Without the support of each and every one of them it would not have been possible to undertake and successfully complete this challenging endeavour. I would like to especially thank my supervisor, Peter Robinson, for giving me the opportunity to be part of this inspirational environment and for being supportive in every imaginable way. I would also like to thank Alan Blackwell for his valuable advice and for involving me with Crucible through which I have learnt tremendously, Neil Dodgson for his technical and moral support and Rosalind Picard and Sean Holden for their insightful comments on the dissertation. Simon Baron-Cohen has provided, and continues to provide, the inspiration for many of the ideas in this dissertation. I would like to thank him and his group at the Autism Research Centre, especially Ofer Golan for sharing their thoughts, research and the Mind Reading DVD with us.
Tony Manstead, Alf Linney, Alexa Wright and Daren McDonald for fruitful discussions on emotions and facial expressions. Special thanks are due to everyone who has volunteered in taking part in the various studies and experiments which I have undertaken, and especially those who have volunteered with their acting at the CVPR 2004 conference. I would like to thank the former and current members of the Rainbow Group and other groups at the Computer Laboratory, especially Maja Vukovic, Mark Ashdown, Eleanor Toye, Jennifer Rode, Scott Fairbanks, Evangelos Kotsovinos and Chris Town for reviewing this dissertation at its various stages, and William Billingsley, Marco Gillies, Michael Blain and Tal Sobol-Shikler for their support. I am also grateful to the Computer Laboratory staff for their timely assistance on technical and administrative issues and for their daily encouragement, especially Lise Gough, Graham Titmus, Jiang He, Chris Hadley, Margarett Levitt, Amanda Hughes and Kate Ellis. At Newnham College, I would like to thank Katy Edgecombe and Pam Hirsch for the support they have shown me both at the research and at the personal level. I would like to thank my “family” in Cambridge, especially Carol Robinson for making me feel part of her family and for taking care of Jana on many occasions, Catherine Gibbs, Helen Blackwell, Rachel Hewson and Amna Jarar for being there for me. I would like to thank my friends, especially Ingy Attwa for faithfully keeping in touch, and Dahlia Madgy for help with technical writing. I have been blessed with the most supportive family, to whom I am very grateful. My parents-in-law Ahmed and Laila for never leaving me alone, and Houssam and Sahar for being supportive. My cousin Serah and my sisters Rasha and Rola for being wonderful and for help with reviewing everything I have written. My parents Ayman and Randa for always being there for me and for believing in me. 
Mom and dad, I wouldn’t be where I am without your hard work, commitment and devotion; there is little I can say to thank you enough. Jana, thank you for being my sunshine and for putting up with a mom who spends most of her time on the “puter”. Wael, thank you for helping me pursue this lifetime dream and for being the most supportive husband, best friend, advisor and mentor anyone can wish for. This research was generously funded by the Cambridge Overseas Trust, British Petroleum, the Overseas Research Student Award, the Computer Laboratory’s Neil Wiseman fund, the Marrion-Kennedy Newnham College Scholarship, and the Royal Academy of Engineering.
Similar resources
Mind Reading Machines: Automated Inference of Cognitive Mental States from Video
Mind reading encompasses our ability to attribute mental states to others, and is essential for operating in a complex social environment. The goal in building mind reading machines is to enable computer technologies to understand and react to people's emotions and mental states. This paper describes a system for the automated inference of cognitive mental states from observed facial expression...
Theory of Mind in Intelligent User Interfaces
The ability to read the mind and the language of the eyes are two cornerstones for the development of social function and emotional intelligence in humans [1]. Mind reading is the ability to infer other people’s mental state and use that to make sense of and predict their behavior. A lack of or impairment in the theory of mind (mindblindness) is thought to be the primary inhibitor of emotion un...
Monte Carlo Simulation to Compare Markovian and Neural Network Models for Reliability Assessment in Multiple AGV Manufacturing System
We compare two approaches for a Markovian model in flexible manufacturing systems (FMSs) using Monte Carlo simulation. The model which is a development of Fazlollahtabar and Saidi-Mehrabad (2013), considers two features of automated flexible manufacturing systems equipped with automated guided vehicle (AGV) namely, the reliability of machines and the reliability of AGVs in a multiple AGV jobsho...
Theory of Mind
Theory of mind (ToM) is the intuitive understanding of one's own and other people's minds or mental states— including thoughts, beliefs, perceptions, knowledge, intentions, desires, and emotions—and of how those mental states influence behavior. Sometimes called intuitive psychology, folk psychology, or even mind-reading, ToM is an innate human ability. The understanding that others have mental...
Automated Classification of Complex Mental States from Head and Facial Displays in Video
This paper presents a system for the automated inference of complex mental states from observed facial expressions and head gestures in video. Head actions are identified from image-based pose estimation. Facial actions, including asymmetric ones, are extracted from motion, color and shape parameters. A multi-level dynamic Bayesian network classifier models complex mental states as a sequence o...
Publication date: 2005